Abstract

We explore the correctness of the Certified Propagation Algorithm (CPA) [5, 1, 7, 4] in
solving broadcast with locally bounded Byzantine faults. CPA allows the nodes to use only
local information regarding the network topology. We provide a tight necessary and sufficient
condition on the network topology for the correctness of CPA. We also present some simple
extensions of this result.

1 Introduction


In this work, we explore fault-tolerant broadcast with locally bounded Byzantine faults in
synchronous point-to-point networks. We assume an $f$-locally bounded model, in which at most $f$
Byzantine faults occur in the neighborhood of every fault-free node [5]. In particular, we are
interested in the necessary and sufficient condition on the network topology for the correctness of
the Certified Propagation Algorithm (CPA), which has been analyzed in prior work [5, 1, 7, 4].


Problem Formulation. Consider a network of $n$ nodes. One node in the network, called the
source ($s$), is given an initial input, which the source node needs to transmit to all the other nodes.
The source $s$ is assumed to be fault-free. We say that CPA is correct if it satisfies the following
properties, where $x_s$ denotes the input at source node $s$:

• Termination: every fault-free node $i$ eventually decides on an output value $y_i$.

• Validity: for every fault-free node $i$, its output value $y_i$ equals the fault-free source's input,
i.e., $y_i = x_s$.

Related Work: Several researchers have addressed problems similar to the above problem. [5]
studied the problem in an infinite grid. [1] developed a sufficient condition in the context of
arbitrary topologies, but the sufficient condition is not necessary. [7] provided necessary and sufficient
conditions, but the two conditions are not identical (not tight). [4] provided another condition that
can approximate (within a factor of 2) the largest $f$ for which CPA is correct in a given graph.


2  System  Model 

The system is assumed to be synchronous. The synchronous communication network consisting of
$n$ nodes including source node $s$ is modeled as a simple directed graph $G(\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set
of $n$ nodes, and $\mathcal{E}$ is the set of directed edges between the nodes in $\mathcal{V}$. We assume that $n \geq 2$,
since the problem for $n = 1$ is trivial. Node $i$ can transmit messages to another node $j$ if and only
if the directed edge $(i,j)$ is in $\mathcal{E}$. Each node can transmit messages to itself as well; however, for
convenience, we exclude self-loops from set $\mathcal{E}$. That is, $(i,i) \notin \mathcal{E}$ for $i \in \mathcal{V}$. All the links (i.e.,
communication channels) are assumed to be reliable. With a slight abuse of terminology, we will
use the terms edge and link interchangeably.

For each node $i$, let $N_i^-$ be the set of nodes from which $i$ has incoming edges. That is,
$N_i^- = \{ j \mid (j,i) \in \mathcal{E} \}$. Similarly, define $N_i^+$ as the set of nodes to which node $i$ has outgoing edges. That
is, $N_i^+ = \{ j \mid (i,j) \in \mathcal{E} \}$. Nodes in $N_i^-$ and $N_i^+$ are, respectively, said to be incoming and outgoing
neighbors of node $i$. Since we exclude self-loops from $\mathcal{E}$, $i \notin N_i^-$ and $i \notin N_i^+$. However, we note
again that each node can indeed transmit messages to itself.
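
As an illustration of these definitions, the following minimal sketch builds the incoming and outgoing neighbor sets from a list of directed edges. The function name `neighbor_sets`, the string node labels, and the example edge list are hypothetical choices made for this sketch only.

```python
from collections import defaultdict

def neighbor_sets(edges):
    """Build incoming and outgoing neighbor sets from directed edges (i, j)."""
    incoming = defaultdict(set)   # incoming[i] plays the role of N_i^-
    outgoing = defaultdict(set)   # outgoing[i] plays the role of N_i^+
    for i, j in edges:
        if i == j:
            continue              # self-loops are excluded from the edge set
        outgoing[i].add(j)
        incoming[j].add(i)
    return incoming, outgoing

# Hypothetical example: four edges plus one self-loop that gets dropped.
edges = [("s", "a"), ("s", "b"), ("a", "b"), ("b", "a"), ("a", "a")]
incoming, outgoing = neighbor_sets(edges)
print(sorted(incoming["a"]))
print(sorted(outgoing["s"]))
```

Dropping the self-loop mirrors the convention above: a node may still deliver messages to itself, but it never appears in its own neighbor sets.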

We consider the $f$-local fault model, with at most $f$ incoming neighbors of any fault-free node
becoming Byzantine faulty.

3 Certified Propagation Algorithm (CPA)


In  this  section,  we  describe  the  Certified  Propagation  Algorithm  (CPA)  from  [5]  formally.  Note 
that  the  faulty  nodes  may  deviate  from  this  specification. 

Source node $s$ commits to its input $x_s$ at the start of the algorithm, i.e., sets its output equal
to $x_s$. The source node is said to have committed to $x_s$ in round 0. The algorithm for each round
$r$, $r > 0$, is as follows:

1. Each node that commits in round $r - 1$ to some value $x$ transmits message $x$ to all its outgoing
neighbors, and then terminates.

2. If any node receives message $x$ directly from source $s$, it commits to output $x$.

3. Through round $r$, if a node has received messages containing value $x$ from at least $f + 1$
distinct incoming neighbors, then it commits to output $x$.
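
The three steps above can be sketched as a small round-based simulation. This is only an illustrative sketch under stated assumptions: the function name `run_cpa`, its parameters, and the example graph are hypothetical, and the faulty nodes follow one simple fixed behavior (sending `bad_value` to every outgoing neighbor in every round, or staying silent), which is not a worst-case adversary.

```python
def run_cpa(nodes, edges, source, x_s, f, faulty=frozenset(), bad_value=None):
    """Simulate CPA in synchronous rounds; return {node: committed value}."""
    out = {v: set() for v in nodes}
    for i, j in edges:
        if i != j:
            out[i].add(j)

    committed = {source: x_s}        # round 0: the source commits to x_s
    just_committed = {source}
    # heard[v][x] = distinct incoming neighbors that have sent value x to v
    heard = {v: {} for v in nodes}

    while just_committed:
        # Step 1: nodes that committed in the previous round transmit once.
        messages = [(u, v, committed[u]) for u in just_committed for v in out[u]]
        # Assumed adversary: faulty nodes push bad_value in every round.
        if bad_value is not None:
            messages += [(u, v, bad_value) for u in faulty for v in out[u]]

        just_committed = set()
        for u, v, x in messages:
            if v in faulty or v in committed:
                continue
            heard[v].setdefault(x, set()).add(u)
            # Step 2: value received directly from the source, or
            # Step 3: value received from at least f + 1 distinct neighbors.
            if u == source or len(heard[v][x]) >= f + 1:
                committed[v] = x
                just_committed.add(v)
    return committed

# Hypothetical example: s feeds a, b, c; each of a, b, c feeds d; c is faulty.
nodes = ["s", "a", "b", "c", "d"]
edges = [("s", "a"), ("s", "b"), ("s", "c"), ("a", "d"), ("b", "d"), ("c", "d")]
result = run_cpa(nodes, edges, "s", x_s=1, f=1, faulty={"c"}, bad_value=0)
print(result)
```

In this example node d hears the wrong value only from c, which is below the $f + 1 = 2$ threshold, and commits once both a and b have relayed the source's value.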

3.1  The  Necessary  Condition 

For CPA to be correct, the network graph $G(\mathcal{V}, \mathcal{E})$ must satisfy the necessary condition proved in
this section. We borrow two relations $\Rightarrow$ and $\not\Rightarrow$ from our previous paper [9].

Definition 1 For non-empty disjoint sets of nodes $A$ and $B$,

• $A \Rightarrow B$ iff there exists a node $v \in B$ that has at least $f + 1$ distinct incoming neighbors in $A$,
i.e., $|N_v^- \cap A| > f$.

• $A \not\Rightarrow B$ iff $A \Rightarrow B$ is not true.

Definition 2 Set $F \subseteq \mathcal{V}$ is said to be a feasible $f$-local fault set, if for each node $v \notin F$, $F$ contains
at most $f$ incoming neighbors of node $v$. That is, for every $v \in \mathcal{V} - F$, $|N_v^- \cap F| \leq f$.
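
Definitions 1 and 2 translate directly into set operations. The sketch below is one possible encoding, assuming nodes are hashable labels and edges are pairs; the names `implies` and `is_feasible_f_local` and the example graph are hypothetical.

```python
def incoming_neighbors(edges, v):
    """Return N_v^-, the incoming neighbors of v (self-loops ignored)."""
    return {i for i, j in edges if j == v and i != v}

def implies(edges, A, B, f):
    """A => B: some node in B has at least f + 1 incoming neighbors in A."""
    return any(len(incoming_neighbors(edges, v) & A) > f for v in B)

def is_feasible_f_local(edges, nodes, F, f):
    """F is feasible if each node outside F has at most f incoming neighbors in F."""
    return all(len(incoming_neighbors(edges, v) & F) <= f for v in nodes - F)

# Hypothetical example graph: s feeds a, b, c; each of a, b, c feeds d.
nodes = {"s", "a", "b", "c", "d"}
edges = [("s", "a"), ("s", "b"), ("s", "c"), ("a", "d"), ("b", "d"), ("c", "d")]

print(implies(edges, {"s", "a", "b"}, {"d"}, 1))        # d has a and b in A
print(implies(edges, {"s", "a"}, {"d"}, 1))             # d has only a in A
print(is_feasible_f_local(edges, nodes, {"c"}, 1))      # d sees one faulty node
print(is_feasible_f_local(edges, nodes, {"a", "b"}, 1)) # d sees two faulty nodes
```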

Theorem 1 Suppose that CPA is correct in graph $G(\mathcal{V}, \mathcal{E})$ under the $f$-local fault model. Let sets
$F, L, R$ form a partition¹ of $\mathcal{V}$, such that (i) source $s \in L$, (ii) $R$ is non-empty, and (iii) $F$ is a
feasible $f$-local fault set, then

• $L \Rightarrow R$, or

• $R$ contains an outgoing neighbor of $s$, i.e., $N_s^+ \cap R \neq \emptyset$.

¹ Sets $X_1, X_2, X_3, \ldots, X_p$ are said to form a partition of set $X$ provided that (i) $\cup_{1 \leq i \leq p} X_i = X$, and
(ii) $X_i \cap X_j = \emptyset$ if $i \neq j$.

Proof: Consider any partition $F, L, R$ such that $s \in L$, $R$ is non-empty, and $F$ is a feasible $f$-local
fault set. Suppose that the input at $s$ is $x_s$. Consider any single execution of the CPA algorithm
such that the nodes in $F$ behave as if they have crashed.

By assumption, CPA is correct in the given network under this faulty behavior. Thus, all the fault-free
nodes commit their output to $x_s$. Given any execution of CPA under the above behavior by the
nodes in $F$, consider a node $v \in R$ that commits its output to $x_s$, such that no other node in $R$
commits its output in a round prior to $v$'s commit. Due to the correctness of CPA, such a node $v$
must exist. For $v$ to be able to commit, either it should receive the message $x_s$ directly from $s$, or
node $v$ must have $f + 1$ distinct incoming neighbors that have committed to $x_s$. Since the nodes in $F$
send nothing and no node in $R$ commits before $v$, the nodes that have committed to $x_s$ prior to $v$
must all be in set $L$. Thus, either $(s, v) \in \mathcal{E}$, or node $v$ has at least $f + 1$ distinct incoming
neighbors in set $L$. □
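
For small graphs, the condition in Theorem 1 can be checked by brute force over all partitions $F, L, R$. The sketch below is illustrative only: the function name `satisfies_condition` and the example graph are hypothetical, and the enumeration visits $3^{n-1}$ labelings, so it is not meant for large networks.

```python
from itertools import product

def satisfies_condition(nodes, edges, source, f):
    """Check the Theorem 1 condition for every partition F, L, R with s in L."""
    others = sorted(v for v in nodes if v != source)
    inc = {v: {i for i, j in edges if j == v and i != v} for v in nodes}
    out_s = {j for i, j in edges if i == source and j != source}

    for labels in product("FLR", repeat=len(others)):
        F = {v for v, lab in zip(others, labels) if lab == "F"}
        L = {v for v, lab in zip(others, labels) if lab == "L"} | {source}
        R = {v for v, lab in zip(others, labels) if lab == "R"}
        if not R:
            continue                      # R must be non-empty
        if any(len(inc[v] & F) > f for v in L | R):
            continue                      # F is not a feasible f-local fault set
        l_implies_r = any(len(inc[v] & L) > f for v in R)
        if not l_implies_r and not (out_s & R):
            return False                  # this partition violates the condition
    return True

# Hypothetical example graph: s feeds a, b, c; each of a, b, c feeds d.
nodes = {"s", "a", "b", "c", "d"}
edges = [("s", "a"), ("s", "b"), ("s", "c"), ("a", "d"), ("b", "d"), ("c", "d")]
print(satisfies_condition(nodes, edges, "s", 1))
print(satisfies_condition(nodes, edges, "s", 2))
```

With $f = 1$ every feasible partition leaves node d at least two incoming neighbors in $L$, so the check succeeds; with $f = 2$ the partition $F = \{a, b\}$, $L = \{s, c\}$, $R = \{d\}$ violates the condition.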

3.2  Sufficiency 

We  now  show  that  the  condition  in  Theorem  1  is  also  sufficient. 

Theorem 2 If $G(\mathcal{V}, \mathcal{E})$ satisfies the condition in Theorem 1, then CPA is correct in $G(\mathcal{V}, \mathcal{E})$ under
the $f$-local fault model.

Proof: Suppose that $G(\mathcal{V}, \mathcal{E})$ satisfies the condition in Theorem 1. Let $F'$ be the set of faulty
nodes. By assumption, $F'$ is a feasible $f$-local fault set. Let $x_s$ be the input at source node $s$. We
will show that (i) fault-free nodes do not commit to any value other than $x_s$, and (ii) until all
the fault-free nodes have committed, in each round of CPA, at least one additional fault-free node
commits to value $x_s$. The proof is by induction.


Induction basis: Source node $s$ commits in round 0 to output equal to its input $x_s$. No other
fault-free nodes commit in round 0.

Induction: Suppose that $L$ is the set of fault-free nodes that have committed to $x_s$ through round
$r$, $r \geq 0$. Thus, $s \in L$. Define $R = \mathcal{V} - L - F'$. If $R = \emptyset$, then the proof is complete. Let us now
assume that $R \neq \emptyset$.

Now consider round $r + 1$.

Consider any fault-free node $u$ that has not committed prior to round $r + 1$ (i.e., $u \in R$). All
the nodes in $L$ have committed to $x_s$ by the end of round $r$. Thus, in round $r + 1$ or earlier, node
$u$ may receive messages containing values different from $x_s$ only from nodes in $F'$. Since there are
at most $f$ incoming neighbors of $u$ in $F'$, node $u$ cannot commit to any value different from $x_s$ in
round $r + 1$.

By the condition in Theorem 1, applied to the partition $F', L, R$, there exists a node $w$ in $R$ such
that (i) node $w$ has an incoming link from $s$, or (ii) node $w$ has incoming links from $f + 1$ nodes
in $L$. In case (i), node $w$ will commit to $x_s$ on receiving $x_s$ from node $s$ in round $r + 1$ (in fact,
$r + 1$ in this case must be 1). In case (ii), every node in $L$ has committed to $x_s$ by the end of
round $r$ (by the definition of $L$), so node $w$ receives $x_s$ from at least $f + 1$ incoming neighbors in $L$
by round $r + 1$.² Thus, node $w$ will commit to $x_s$ in round $r + 1$.

This  completes  the  proof.  □ 


4  Discussion 

This section discusses some extensions of the result presented above.

² Since node $w$ did not commit prior to round $r + 1$, it follows that at least one node in $L$ must have committed
in round $r$.

4.1 Generalized Fault Model


In this subsection, we briefly discuss how to extend the above results under a generalized fault
model. The generalized fault model [8] is characterized using fault domain $\mathcal{F} \subseteq 2^{\mathcal{V}}$ as follows:
Nodes in set $F$ may fail during an execution of the algorithm only if there exists set $F^* \in \mathcal{F}$ such
that $F \subseteq F^*$. Set $F$ is then said to be a feasible fault set.

Definition 3 Set $F \subseteq \mathcal{V}$ is said to be a feasible fault set, if there exists $F^* \in \mathcal{F}$ such that $F \subseteq F^*$.

Please refer to our previous work [8] for more discussion on the generalized fault model.

For a set of nodes $B$, define $N^-(B) = \{ i \mid (i,j) \in \mathcal{E},\ i \notin B,\ j \in B \}$, the set of incoming
neighbors of $B$.

Definition 4 Given $\mathcal{F}$, for disjoint sets of nodes $A$ and $B$, where $B$ is non-empty,

• $A \leadsto B$ iff for every $F^* \in \mathcal{F}$, $N^-(B) \cap A \not\subseteq F^*$.

• $A \not\leadsto B$ iff $A \leadsto B$ is not true.
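
The relation in Definition 4 can be evaluated with plain set containment. In the sketch below, the fault domain is represented as a list of sets; the names `incoming_neighbors_of_set` and `leads_to`, the example graph, and the example fault domain are all hypothetical.

```python
def incoming_neighbors_of_set(edges, B):
    """Return N^-(B): nodes outside B with an edge into some node of B."""
    return {i for i, j in edges if j in B and i not in B}

def leads_to(edges, A, B, fault_domain):
    """Definition 4: N^-(B) & A is not contained in any F* of the fault domain."""
    frontier = incoming_neighbors_of_set(edges, B) & A
    return all(not frontier <= f_star for f_star in fault_domain)

# Hypothetical example graph and fault domain.
edges = [("s", "a"), ("s", "b"), ("s", "c"), ("a", "d"), ("b", "d"), ("c", "d")]
fault_domain = [{"a"}, {"b", "c"}]

print(leads_to(edges, {"a", "b"}, {"d"}, fault_domain))  # {a, b} fits in no F*
print(leads_to(edges, {"b", "c"}, {"d"}, fault_domain))  # {b, c} fits in {b, c}
```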


Under  the  generalized  fault  model,  step  3  of  CPA  needs  to  be  modified  as  follows.  Let  us  call 
the  modified  algorithm  CPA-G. 

3.  Through  round  r,  if  a  node  has  received  messages  containing  value  x  from  a  set  M,  where  M 
is  not  a  feasible  fault  set,  then  the  node  commits  to  value  x. 
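
The modified step 3 amounts to a single test on the set of senders. The following sketch assumes a non-empty fault domain given as a list of sets; the names `is_feasible_fault_set` and `should_commit` and the example fault domain are hypothetical.

```python
def is_feasible_fault_set(M, fault_domain):
    """Definition 3: M is feasible if it is contained in some F* of the domain."""
    return any(M <= f_star for f_star in fault_domain)

def should_commit(senders_of_x, fault_domain):
    """Modified step 3: commit to x once its senders cannot all be faulty."""
    return not is_feasible_fault_set(senders_of_x, fault_domain)

# Hypothetical fault domain: either a alone may fail, or b and c together.
fault_domain = [{"a"}, {"b", "c"}]

print(should_commit({"b", "c"}, fault_domain))  # both senders may be faulty
print(should_commit({"a", "b"}, fault_domain))  # at least one sender is fault-free
```

Under the $f$-local model the same test reduces to counting: a sender set is infeasible exactly when it contains at least $f + 1$ incoming neighbors.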

It  is  easy  to  show  that  a  modified  version  of  Theorem  1  stated  below  holds  for  the  generalized 
fault  model. 

Theorem 3 Suppose that CPA-G is correct in graph $G(\mathcal{V}, \mathcal{E})$ under the generalized fault model.
Let sets $F, L, R$ form a partition of $\mathcal{V}$, such that (i) source $s \in L$, (ii) $R$ is non-empty, and (iii) $F$
is a feasible fault set, then

• $L \leadsto R$, or

• $R$ contains an outgoing neighbor of $s$, i.e., $N_s^+ \cap R \neq \emptyset$.

4.2  Broadcast  Channel 

We have so far assumed that the underlying network is a point-to-point network. The results,
however, can be easily extended to the broadcast or radio model [5, 1] as well. In the broadcast
model, when a node transmits a value, all of its outgoing neighbors receive this value identically.
Thus, no node can transmit mismatching values to different outgoing neighbors. Then, it is easy
to see that the same condition as for the point-to-point network can be shown to be necessary and
sufficient for the correctness of CPA under the broadcast model as well.

Now consider the following variation of the CPA algorithm: if the outgoing neighbors of source
$s$ do not receive a message from $s$ in round 1, the message value is assumed to be some default
value. With this modification, the condition in Theorem 1 can also be shown to be necessary
and sufficient to perform Byzantine Broadcast [6] under the broadcast model, while satisfying the
following three conditions (allowing $s$ to be faulty):

• Termination: every fault-free node $i$ eventually decides on an output value $y_i$.

• Agreement: the output values of all the fault-free nodes are equal, i.e., there exists $y$ such
that, for every fault-free node $i$, $y_i = y$.

• Validity: if the source node is fault-free, then for every fault-free node $i$, the output value
equals the source's input, i.e., $y = x_s$.

The proof follows from the proof of Theorem 1 and the observation that all the outgoing neighbors
of $s$ receive an identical value from $s$, which equals its input $x_s$ when $s$ is fault-free.

4.3  Asynchronous  Network 

In our analysis so far, we have assumed that the system is synchronous. For a point-to-point
network with fault-free source $s$, it should be easy to see that the condition in Theorem 1 is also
necessary and sufficient to achieve agreement using a CPA-like algorithm under the asynchronous
model [2] as well. In this case, the algorithm may not proceed in rounds, but a node still commits to value
$x$ either on receiving the value directly from $s$, or from $f + 1$ nodes.
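
A CPA-like algorithm without rounds can be sketched as an event-driven loop in which each delivered message may trigger a commit. The sketch below is illustrative only and rests on several assumptions: the name `run_async_cpa` and the example graph are hypothetical, messages are delivered one at a time in FIFO order as a stand-in for an arbitrary delivery schedule, the source is fault-free, and faulty nodes stay silent.

```python
from collections import deque

def run_async_cpa(nodes, edges, source, x_s, f, faulty=frozenset()):
    """Event-driven CPA-like propagation; returns {node: committed value}."""
    out = {v: set() for v in nodes}
    for i, j in edges:
        if i != j:
            out[i].add(j)

    committed = {source: x_s}
    heard = {v: {} for v in nodes}   # heard[v][x] = distinct senders of x to v
    # Pending messages (sender, receiver, value); FIFO is an assumed schedule.
    pending = deque((source, v, x_s) for v in sorted(out[source]))

    while pending:
        u, v, x = pending.popleft()
        if v in faulty or v in committed:
            continue
        heard[v].setdefault(x, set()).add(u)
        # Commit on a direct message from s, or on f + 1 distinct senders.
        if u == source or len(heard[v][x]) >= f + 1:
            committed[v] = x
            pending.extend((v, w, x) for w in sorted(out[v]))
    return committed

# Hypothetical example: s feeds a, b, c; each of a, b, c feeds d; c is faulty.
nodes = ["s", "a", "b", "c", "d"]
edges = [("s", "a"), ("s", "b"), ("s", "c"), ("a", "d"), ("b", "d"), ("c", "d")]
print(run_async_cpa(nodes, edges, "s", x_s=1, f=1, faulty={"c"}))
```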

This claim may seem to contradict the FLP result [3]. However, our claim assumes that the
source node is fault-free, unlike [3].


5  Conclusion 

In this paper, we explore broadcast using the CPA algorithm in the presence of Byzantine faults. In
particular, we provide a tight necessary and sufficient condition for the correctness of CPA.


References 

[1] V. Bhandari and N. H. Vaidya. On reliable broadcast in a radio network: A simplified
characterization. Technical report, University of Illinois at Urbana-Champaign, 2005.

[2]  D.  Dolev,  N.  A.  Lynch,  S.  S.  Pinter,  E.  W.  Stark,  and  W.  E.  Weihl.  Reaching  approximate 
agreement  in  the  presence  of  faults.  J.  ACM,  33:499-516,  May  1986. 

[3]  M.  J.  Fischer,  N.  A.  Lynch,  and  M.  S.  Paterson.  Impossibility  of  distributed  consensus  with 
one  faulty  process.  J.  ACM,  32:374-382,  April  1985. 

[4]  A.  Ichimura  and  M.  Shigeno.  A  new  parameter  for  a  broadcast  algorithm  with  locally  bounded 
byzantine  faults.  Inf.  Process.  Lett.,  110(12-13):514-517,  June  2010. 

[5] C.-Y. Koo. Broadcast in radio networks tolerating byzantine adversarial behavior. In Proc.
23rd Annual ACM Symp. on Principles of Distributed Computing (PODC '04), 2004.

[6]  M.  Pease,  R.  Shostak,  and  L.  Lamport.  Reaching  agreement  in  the  presence  of  faults.  J.  ACM, 
27(2):228-234,  Apr.  1980. 

[7]  A.  Pelc  and  D.  Peleg.  Broadcasting  with  locally  bounded  byzantine  faults.  Inf.  Process.  Lett., 
93(3):109-115,  Feb.  2005. 




[8]  L.  Tseng  and  N.  H.  Vaidya.  Iterative  approximate  byzantine  consensus  under  a  generalized 
fault  model.  Technical  report,  CSL,  UIUC,  2012. 

[9] N. H. Vaidya, L. Tseng, and G. Liang. Iterative approximate byzantine consensus in arbitrary
directed graphs. In Proceedings of the thirty-first annual ACM symposium on Principles of
distributed computing, PODC '12. ACM, 2012.




